Improved Algorithms for Online Submodular Maximization via First-order Regret Bounds
We consider the problem of nonnegative submodular maximization in the online setting. At time step $t$, an algorithm selects a set $S_t \in \mathcal{C} \subseteq 2^V$, where $\mathcal{C}$ is a feasible family of sets. An adversary then reveals a submodular function $f_t$. The goal is to design an efficient algorithm for minimizing the expected approximate regret. In this work, we give a general approach for improving regret bounds in online submodular maximization by exploiting "first-order" regret bounds for online linear optimization.
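The interaction the abstract describes can be sketched as a simple loop with an approximate-regret benchmark. The following is a minimal illustration, not the paper's algorithm; all names (`alpha_regret`, the toy instance) are illustrative:

```python
# Sketch of the online protocol: at each round t the learner picks a
# feasible set S_t, the adversary reveals a reward function f_t, and
# performance is measured by alpha-regret against the best fixed set.

def alpha_regret(alpha, feasible_sets, fs, chosen):
    """alpha * (best fixed set's total value) - (learner's total value)."""
    best = max(sum(f(S) for f in fs) for S in feasible_sets)
    earned = sum(f(S) for f, S in zip(fs, chosen))
    return alpha * best - earned

# Toy instance: two feasible singletons, two (modular) reward functions.
feasible = [frozenset({0}), frozenset({1})]
fs = [lambda S: len(S & {0}), lambda S: len(S & {0, 1})]
chosen = [frozenset({1}), frozenset({1})]  # the learner's picks
reg = alpha_regret(0.5, feasible, fs, chosen)  # 0.5 * 2 - 1 = 0.0
```

The benchmark is scaled by $\alpha$ because, for submodular objectives, even the offline problem can only be solved up to an approximation factor, so regret against the unscaled optimum would be unattainable.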
Review for NeurIPS paper: Improved Algorithms for Online Submodular Maximization via First-order Regret Bounds
Weaknesses: This paper has a few weaknesses, which are as follows: - Although the authors have cited [9] and [10] as previous literature on online submodular maximization, they have not compared their regret bounds with those of [9] and [10]. These two papers consider the online continuous submodular maximization problem (as opposed to the submodular set-function maximization studied in this work) and may seem different at first glance; however, since the multilinear extension of a submodular set function is continuous submodular, a combination of the continuous algorithms of [9] and [10] with a lossless rounding technique such as pipage rounding could be used for the discrete problem. In particular, the algorithms of [9] and [10] are almost identical to Algorithm 1 of this paper, and the only novelty of Algorithm 1 is the use of g f-l (as opposed to f) as the utility function, which leads to \alpha-regret bounds with a curvature-dependent approximation ratio \alpha. I noticed that this issue is addressed in more detail in the appendix; however, I think it is important to also discuss the computational complexity of the discretized version of Algorithm 1 in the main paper. In particular, implementing this algorithm on a real-world problem and comparing its performance for different discretization levels and various numbers of samples could have made the paper even better.
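The reviewer's point about discretization levels and sample counts stems from the fact that the multilinear extension is an expectation, so in practice it is only available via Monte Carlo sampling. A minimal sketch of such an estimator, under assumed names and a toy coverage objective (none of this is from the paper under review):

```python
import random

def multilinear_extension(f, x, n_samples=5000, seed=0):
    """Monte Carlo estimate of F(x) = E[f(R)], where the random set R
    contains each element i independently with probability x[i]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        R = {i for i, p in x.items() if rng.random() < p}
        total += f(R)
    return total / n_samples

# Toy coverage function: f(S) = size of the union of the chosen sets.
sets = {0: {1, 2}, 1: {2, 3}, 2: {4}}
f = lambda S: len(set().union(*(sets[i] for i in S)))
x = {0: 0.5, 1: 0.5, 2: 0.5}
est = multilinear_extension(f, x)  # exact value here is 2.25
```

The number of samples trades estimation error against runtime, which is exactly why the per-round cost of a discretized continuous algorithm deserves explicit discussion.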
First-order regret bounds for combinatorial semi-bandits
We consider the problem of online combinatorial optimization under semi-bandit feedback, where a learner has to repeatedly pick actions from a combinatorial decision set in order to minimize the total losses associated with its decisions. After making each decision, the learner observes the losses associated with its own action, but not those of other actions. For this problem, there are several learning algorithms that guarantee that the learner's expected regret grows as $\widetilde{O}(\sqrt{T})$ with the number of rounds $T$. In this paper, we propose an algorithm that improves this scaling to $\widetilde{O}(\sqrt{L_T^*})$, where $L_T^*$ is the total loss of the best action. Our algorithm is among the first to achieve such guarantees in a partial-feedback scheme, and the first to do so in a combinatorial setting.
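The flavor of a first-order bound is easiest to see in the simpler full-information experts setting (not the semi-bandit setting of this paper): with multiplicative weights (Hedge), realized regret tracks the loss $L^*$ of the best expert rather than the horizon $T$. A small illustrative experiment, with assumed names and a hand-built "easy" instance:

```python
import math

def hedge(loss_rounds, eta):
    """Multiplicative-weights (Hedge) over K experts with full feedback;
    returns the learner's total expected loss."""
    K = len(loss_rounds[0])
    w = [1.0] * K
    total = 0.0
    for losses in loss_rounds:
        Z = sum(w)
        total += sum(wi / Z * li for wi, li in zip(w, losses))
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
    return total

# "Easy" instance over T = 1000 rounds: expert 0 incurs loss only on
# every 10th round, so its total loss is L* = 100 while T = 1000.
rounds = [[1.0, 1.0] if t % 10 == 0 else [0.0, 1.0] for t in range(1000)]
learner_loss = hedge(rounds, eta=0.1)
regret = learner_loss - 100  # stays well below sqrt(T) ~ 31.6 here
```

On easy instances like this, $\sqrt{L^*} = 10$ is far smaller than $\sqrt{T} \approx 31.6$, which is the gap a first-order bound exploits; achieving the analogous guarantee with only semi-bandit feedback is the paper's contribution.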